85 research outputs found

    Heterogeneous mission planning for a single unmanned aerial vehicle (UAV) with attention-based deep reinforcement learning

    Large-scale and complex mission environments require unmanned aerial vehicles (UAVs) to handle various types of missions while respecting their operational and dynamic constraints. This article proposes a deep learning-based heterogeneous mission planning algorithm for a single UAV. We first formulate the heterogeneous mission planning problem as a vehicle routing problem (VRP) and then solve it with an attention-based deep reinforcement learning approach. Attention-based neural networks are used because they process the sequence data of the VRP efficiently. As input to these networks, we introduce a unified feature representation that encodes different types of missions into same-sized vectors. In addition, a masking strategy is introduced to account for the resource constraint (e.g., flight time) of the UAV. Simulation results show that the proposed approach computes significantly faster than the baseline algorithms while maintaining relatively good performance.
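
    The masking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the per-mission `costs` list, and the simple "cost must fit in the remaining flight time" rule are assumptions made only for illustration. Infeasible missions get a score of negative infinity, so the softmax assigns them zero probability.

```python
import math


def masked_mission_probs(scores, costs, remaining_time):
    """Softmax over mission scores, masking missions that exceed the budget.

    scores: raw attention scores, one per candidate mission (hypothetical input)
    costs: flight time needed for each mission (hypothetical input)
    remaining_time: flight time the UAV still has available
    """
    feasible = [cost <= remaining_time for cost in costs]
    if not any(feasible):
        # Nothing fits in the remaining budget: no mission can be selected.
        return [0.0] * len(scores)

    # Infeasible missions are pushed to -inf so exp() maps them to exactly 0.
    masked = [s if ok else -math.inf for s, ok in zip(scores, feasible)]
    top = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


probs = masked_mission_probs([1.0, 2.0, 3.0], [5.0, 20.0, 8.0], remaining_time=10.0)
```

    Here the second mission needs more flight time than remains, so it receives zero probability and the other two share the full probability mass.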

    Neural Architectural Nonlinear Pre-Processing for mmWave Radar-based Human Gesture Perception

    In modern on-driving computing environments, many sensors are used for context-aware applications. This paper uses two deep learning models, U-Net and EfficientNet, both built on convolutional neural networks (CNNs), to detect hand gestures and remove noise in Range Doppler Map images measured with a millimeter-wave (mmWave) radar. Accurate pre-processing algorithms are essential for improving classification performance. Therefore, a novel pre-processing approach that denoises the images before they enter the first deep learning model stage is used to increase classification accuracy. Thus, this paper proposes a deep neural network-based, high-performance nonlinear pre-processing method. Comment: 4 pages, 7 figure

    Encoder-decoder multimodal speaker change detection

    The task of speaker change detection (SCD), which detects points where speakers change in an input, is essential for several applications. Several studies have addressed the SCD task using audio inputs only and have shown limited performance. Recently, multimodal SCD (MMSCD) models, which utilise the text modality in addition to audio, have shown improved performance. In this study, the proposed model is built upon two main proposals: a novel mechanism for modality fusion and the adoption of an encoder-decoder architecture. Unlike previous MMSCD works that extract speaker embeddings from extremely short audio segments aligned to a single word, we use a speaker embedding extracted from 1.5 s of audio. A transformer decoder layer further improves the performance of an encoder-only MMSCD model. The proposed model achieves state-of-the-art results among studies that report SCD performance and is also on par with recent work that combines SCD with automatic speech recognition via human transcription. Comment: 5 pages, accepted for presentation at INTERSPEECH 202

    Proposed Protocols for Artificial Intelligence Imaging Database in Acute Stroke Imaging

    Purpose: To propose standardized and feasible imaging protocols for constructing an artificial intelligence (AI) database in acute stroke by assessing current practice at tertiary hospitals in South Korea and reviewing evolving AI models. Materials and Methods: A nationwide survey on acute stroke imaging protocols was conducted using an electronic questionnaire sent to 43 registered tertiary hospitals between April and May 2021. Imaging protocols for endovascular thrombectomy (EVT) in the early and late time windows and during follow-up were assessed. Clinical applications of AI techniques in stroke imaging and the sequences required for developing AI models were reviewed. Standardized and feasible imaging protocols for data curation in acute stroke were proposed. Results: There was considerable heterogeneity in the imaging protocols for EVT candidates in the early and late time windows and in posterior circulation stroke. Computed tomography (CT)-based protocols were adopted by 70% (30/43) of hospitals, and acquisition of noncontrast CT, CT angiography, and CT perfusion in a single session was the most common approach (47%, 14/30), with a preference for multiphase (70%, 21/30) over single-phase CT angiography. More hospitals performed magnetic resonance imaging (MRI)-based protocols or additional MRI sequences in the late time window and in posterior circulation stroke. Diffusion-weighted imaging (DWI) and fluid-attenuated inversion recovery (FLAIR) were the most commonly performed MRI sequences, with considerable variation in the use of other MRI sequences. AI models for diagnostic purposes required noncontrast CT, CT angiography, and DWI, while FLAIR, dynamic susceptibility contrast perfusion, and T1-weighted imaging (T1WI) were additionally required for prognostic AI models. Conclusion: Given the considerable heterogeneity in acute stroke imaging protocols at tertiary hospitals in South Korea, standardized and feasible imaging protocols are required for constructing an AI database in acute stroke. The essential sequences may be noncontrast CT, DWI, CT/MR angiography, and CT/MR perfusion, while FLAIR and T1WI may be additionally required.

    Honeycomb oxide heterostructure: a new platform for Kitaev quantum spin liquid

    The Kitaev quantum spin liquid, a massively quantum-entangled state, is so scarce in nature that searching for new candidate systems remains a great challenge. Honeycomb heterostructures could be a promising route to realizing and utilizing such an exotic quantum phase, because they provide additional controllability of the Hamiltonian and device compatibility, respectively. Here, we present an epitaxial thin film of the honeycomb oxide Na3Co2SbO6, a recently proposed Kitaev quantum spin liquid candidate. We found spin-glass and antiferromagnetic ground states depending on the Na stoichiometry, signifying not only the importance of Na vacancy control but also strong frustration in Na3Co2SbO6. Despite its classical ground state, the field-dependent magnetic susceptibility shows a remarkable scaling collapse with a single critical exponent, which can be interpreted as evidence of quantum criticality. Its electronic ground state and the spin Hamiltonian derived from spectroscopies are consistent with the predicted Kitaev model. Our work provides a unique route to the realization and utilization of the Kitaev quantum spin liquid.

    Deep-Learning-Based Algorithm for the Removal of Electromagnetic Interference Noise in Photoacoustic Endoscopic Image Processing

    Despite all the expectations for photoacoustic endoscopy (PAE), several technical issues must still be resolved before the technique can be successfully translated into clinics. Among these, electromagnetic interference (EMI) noise, in addition to the limited signal-to-noise ratio (SNR), has hindered the rapid development of related technologies. Unlike endoscopic ultrasound, in which the SNR can be increased by simply applying a higher pulsing voltage, there is a fundamental limitation on raising the SNR of PAE signals because it is mostly determined by the applied optical pulse energy, which must stay within the safety limits. Moreover, typical PAE hardware requires a wide separation between the ultrasonic sensor and the amplifier, meaning that it is not easy to build an ideal PAE system that would be unaffected by EMI noise. To expedite the progress of related research, in this study we investigated the feasibility of deep-learning-based EMI noise removal in PAE image processing. In particular, we selected four fully convolutional neural network architectures, U-Net, SegNet, FCN-16s, and FCN-8s, and observed that a modified U-Net architecture outperformed the other architectures in EMI noise removal. Classical filter methods were also compared to confirm the superiority of the deep-learning-based approach. Notably, it was the U-Net architecture that enabled us to produce a denoised 3D vasculature map that could even depict the mesh-like capillary networks distributed in the wall of a rat colorectum. As the development of low-cost laser diode or LED-based photoacoustic tomography (PAT) systems is now emerging as one of the important topics in PAT, we expect that the presented AI strategy for the removal of EMI noise could be broadly applicable to many areas of PAT in which the ability to apply a hardware-based prevention method is limited and EMI noise therefore appears more prominently because of poor SNR.

    Dynamic behavior and degradation of a fuel cell system

    In this dissertation, the effect of dynamic operation of a Ballard Nexa 1.2 kW fuel cell system is investigated. Three specific topics are considered: the first is an analysis of the dynamic behavior of the fuel cell system, the second is an evaluation and examination of fuel cell membrane degradation during dynamic operation, and the last is a numerical simulation to predict the transient response of the cell voltage. To enable the analysis of the fuel cell system’s dynamic behavior, a simple method for analyzing the system’s voltage response to a step change in load resistance is presented. A modified Randles model is used as the system model, where two resistors and two capacitors are implemented for the Warburg impedance. Using that model, the response is fitted with three exponential curves. Six independent equations corresponding to the six parameters of the model can be solved using the fitted values, under a specific assumption for the initial state. The impedance is also simulated using the estimated parameters. Cyclic operation is thought to have a negative impact on fuel cell lifetime. The effect of the frequency of cyclic operation on chemical degradation is investigated. After calculating each parameter value through exponential curve fitting, the dynamic behaviors of the three resistor-capacitor pairs are simulated using MATLAB Simulink®. In addition, fluoride release as a function of the frequency of cyclic operation is evaluated by measuring the concentration of fluoride ions in the effluent of a fuel cell. The frequency effect on chemical degradation is explained by comparing the simulated result with the fluoride release result. Finally, a single-phase numerical model to predict the voltage transients of a PEMFC is presented. A new approach is developed by classifying the current density into two groups: charging currents, which accumulate at the interfaces where the reaction occurs, and faradaic currents, which are charge transfer currents. The successive change of the activation overpotential is calculated using the charging currents and the element law for an ideal linear capacitor, and the resulting transient voltage response to a step load change is presented.
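
    The capacitor-based update described at the end of this abstract can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's model: the faradaic current is linearized as overpotential divided by a constant charge-transfer resistance `r_ct` (an assumption made here to keep the example short), and all names and parameter values are hypothetical. The element law for an ideal linear capacitor, i = C dV/dt, is stepped forward in time with the explicit Euler method.

```python
import math


def activation_overpotential_step(i_load, r_ct, c_dl, dt, n_steps, eta0=0.0):
    """Overpotential history after a step in load current (explicit Euler).

    i_load: total current after the load step, in A (hypothetical value)
    r_ct: charge-transfer resistance in ohm (linearization assumed here)
    c_dl: capacitance of the ideal linear capacitor in F
    dt: time step in s; n_steps: number of steps; eta0: initial overpotential in V
    """
    eta = eta0
    history = [eta]
    for _ in range(n_steps):
        i_faradaic = eta / r_ct            # charge-transfer part of the current
        i_charging = i_load - i_faradaic   # remainder charges the interface
        eta += dt * i_charging / c_dl      # capacitor law: d(eta)/dt = i_c / C
        history.append(eta)
    return history


history = activation_overpotential_step(
    i_load=10.0, r_ct=0.02, c_dl=1.0, dt=1e-5, n_steps=10_000
)
# For this linearized case the exact solution is i*R*(1 - exp(-t / (R*C))).
exact_final = 10.0 * 0.02 * (1.0 - math.exp(-0.1 / (0.02 * 1.0)))
```

    With the linearization, the overpotential rises monotonically toward `i_load * r_ct`, so the numerical result can be checked against the closed-form exponential.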

    Heterogeneous Mission Planning of UAVs with Attention-Based Deep Reinforcement Learning

    Department of Mechanical Engineering. This paper proposes a deep learning-based mission planning algorithm for UAVs that deals with heterogeneous types of missions. We formulate the heterogeneous mission planning problem as a vehicle routing problem. Then, we solve the problem using an attention-based reinforcement learning approach, chosen for its fast computation time and flexibility. We introduce a unified mission representation for heterogeneous missions and an action masking strategy so that an attention-based neural network can be used for heterogeneous mission planning. The proposed algorithm is compared with state-of-the-art heuristic algorithms in terms of performance and computation time in numerical simulation. Simulation results show that the proposed approach computes significantly faster than the baseline algorithms while maintaining good performance. The ablation study demonstrates the effectiveness of the unified mission representation.
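
    As a rough illustration of what a unified mission representation could look like, the sketch below maps differently shaped missions onto vectors of one fixed length. The mission types (`point`, `line`, `area`), the field layout, and the function name are all hypothetical; the abstract does not specify the features the authors use.

```python
# Hypothetical mission types, chosen only for this illustration.
MISSION_TYPES = ("point", "line", "area")


def encode_mission(kind, start, end=None, service_time=0.0):
    """Encode a mission as [x_start, y_start, x_end, y_end, service_time, one-hot type].

    A mission without a distinct end position reuses its start position, so
    every mission type yields a vector of the same length.
    """
    if kind not in MISSION_TYPES:
        raise ValueError(f"unknown mission type: {kind!r}")
    if end is None:
        end = start
    one_hot = [1.0 if kind == t else 0.0 for t in MISSION_TYPES]
    return [float(start[0]), float(start[1]),
            float(end[0]), float(end[1]),
            float(service_time)] + one_hot


missions = [
    encode_mission("point", (1.0, 2.0)),
    encode_mission("line", (0.0, 0.0), end=(3.0, 4.0)),
    encode_mission("area", (5.0, 5.0), end=(6.0, 7.0), service_time=12.0),
]
```

    Because every vector has the same length, the list can be stacked into a single input matrix for a sequence model regardless of how many mission types appear.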

    Decentralized Multi-robot Task Allocation with Attention-Based Deep Reinforcement Learning Algorithm


    Run Your 3D Object Detector on NVIDIA Jetson Platforms: A Benchmark Analysis

    This paper presents a benchmark analysis of NVIDIA Jetson platforms when operating deep learning-based 3D object detection frameworks. Three-dimensional (3D) object detection could be highly beneficial for the autonomous navigation of robotic platforms, such as autonomous vehicles, robots, and drones. Since the function provides one-shot inference that extracts the 3D positions, with depth information, and the heading directions of neighboring objects, robots can generate a reliable path to navigate without collision. To enable the smooth functioning of 3D object detection, several approaches have been developed to build detectors using deep learning for fast and accurate inference. In this paper, we investigate 3D object detectors and analyze their performance on the NVIDIA Jetson series, which contains an onboard graphics processing unit (GPU) for deep learning computation. Since robotic platforms often require real-time control to avoid dynamic obstacles, onboard processing with a built-in computer is an emerging trend. The Jetson series satisfies such requirements with a compact board size and suitable computational performance for autonomous navigation. However, the Jetson has not yet been extensively benchmarked on computationally expensive tasks such as point cloud processing. To examine the Jetson series on such tasks, we tested the performance of all commercially available boards (i.e., Nano, TX2, NX, and AGX) with state-of-the-art 3D object detectors. We also evaluated the effect of using the TensorRT library to optimize a deep learning model for faster inference and lower resource utilization on the Jetson platforms. We present benchmark results in terms of three metrics: detection accuracy, frames per second (FPS), and resource usage with power consumption. From the experiments, we observe that all Jetson boards, on average, consume over 80% of GPU resources. Moreover, TensorRT could remarkably increase inference speed (i.e., four times faster) and cut central processing unit (CPU) and memory consumption in half. By analyzing such metrics in detail, we establish research foundations on edge device-based 3D object detection for the efficient operation of various robotic applications.